WordPress Shortcodes in CFML

Shortcodes are simple inline directives that can be parsed and executed, rather like CFML or JSP custom tags embedded in HTML. For example, here is the shortcode to generate the \LaTeX logo inline:

[latex block=false]\LaTeX[/latex]

If you want to see the raw \LaTeX, it's in the tooltip/title of the images. WordPress.com provides the encoding service, though you can run a local service if you want.

Demo

Here's some sample content that contains several shortcodes:

Hey, it's [time], or [time mask="HH:mm:ss tt"], or simply hour [time mask=H /].

This demo uses a port of the WordPress [latex block=false color=00f]\LaTeX[/latex] plugin, plus my "block hack", with some example formulas taken from my sun/earth collision model.

Euler's Identity (at different sizes): [latex size="-1"]e^{i \pi} + 1 = 0[/latex]

[latex size="2"]e^{i \pi} + 1 = 0[/latex]

Gravitational Constant: [latex]G= 6.673 \times 10^{-11} m^3\: kg^{-1}\: s^{-2} = 6.673 \times 10^{-11} \frac{m^3}{kg\:s^2}[/latex]

Force due to gravity: [latex]F = G \frac{M_1 M_2}{r^2}[/latex]

Inline (or non-block) also works (e.g., [latex block=false] 2 + 2 = 4[/latex]).

Passing it through the shortcode processor yields this:

Hey, it's 6:17, or 18:17:34 PM, or simply hour 18.

This demo uses a port of the WordPress \LaTeX plugin, plus my "block hack", with some example formulas taken from my sun/earth collision model.

Euler's Identity (at different sizes): e^{i \pi} + 1 = 0

e^{i \pi} + 1 = 0

Gravitational Constant: G= 6.673 \times 10^{-11} m^3\: kg^{-1}\: s^{-2} = 6.673 \times 10^{-11} \frac{m^3}{kg\:s^2}

Force due to gravity: F = G \frac{M_1 M_2}{r^2}

Inline (or non-block) also works (e.g.,  2 + 2 = 4).

How It Works

When you run some content through the shortcode processor, it parses out individual shortcodes and passes their information to a handler. Shortcodes can be single tags or wrapper tags, and may be passed attributes. The LaTeX shortcode is a wrapper tag (with LaTeX code as the wrapped content).

In WordPress (PHP), the handlers are function callbacks. That approach doesn't carry over well to CFML, which has no global scope, so handlers are CFCs instead. You can still use a function callback if you want, but because CFML functions are plain functions rather than closures, callbacks come with a variety of problems. A handler CFC has an 'execute' method that accepts the shortcode's attributes, body content, and name, along with a couple of helper functions. It returns the string that should replace the shortcode in the result.

Consider this shortcode:

[latex block=false color=0000ff]\LaTeX[/latex]

It is basically equivalent to this method invocation:

writeOutput(shortcodes["latex"].execute(
	{block = false, color = "0000ff"},
	"\LaTeX",
	"latex",
	constrainAttributes, // for defaulting missing attributes, and removing extraneous ones
	processShortcodes // for recursively processing shortcodes in the body, if needed
));

Then we have latex.cfc's execute method, which looks something like this:

function execute(attrs, content, tagname, constrainAttributes, processShortcodes) {
	var theUrl = "http://s.wordpress.com/latex.php?latex=";
	attrs = constrainAttributes({block = true, color = "000000"}, attrs);
	theUrl &= "#urlEncodedFormat(content)#&fg=#attrs.color#";
	return '<img src="#theUrl#" class="latex #attrs.block ? 'block' : 'inline'#" />';
}

The actual implementation is a bit more complex, of course, doing validation of attributes, LaTeX cleanup, etc., but that's the gist of it. When invoked, it'll return something like this:

<img src="http://s.wordpress.com/latex.php?latex=%5CLaTeX&bg=T&fg=0000ff&s=0" alt="\LaTeX" title="\LaTeX" class="latex inline" />
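The latex handler never touches its fifth argument, so here is a rough sketch of a wrapper handler that does. Everything about it is hypothetical: the box.cfc name, the 'kind' attribute, and the assumption that processShortcodes takes the raw body as a string and returns it with any nested shortcodes expanded.

```cfml
// box.cfc: hypothetical wrapper shortcode, e.g. [box kind=note]It's [time][/box]
component {

	function init() {
		return this;
	}

	function execute(attrs, content, tagname, constrainAttributes, processShortcodes) {
		// Assumption: processShortcodes(string) returns the string with nested shortcodes expanded.
		var body = processShortcodes(content);
		// Default the missing 'kind' attribute and drop anything unexpected.
		attrs = constrainAttributes({kind = "note"}, attrs);
		return '<div class="box #htmlEditFormat(attrs.kind)#">#body#</div>';
	}

}
```

A handler like this would be registered the same way as the latex one, under whatever name you want the shortcode to have.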

The 'time' shortcode is much simpler, and is implemented as a function callback:

function time_callback(attrs) {
	var defaults = { mask = "h:mm" };
	attrs = constrainAttributes(defaults, attrs); // already in scope for callbacks
	return timeFormat(now(), attrs.mask);
}
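To make the helper's contract concrete, here is a minimal sketch of what a constrainAttributes function could look like, going only by the description above (default the missing attributes, remove the extraneous ones). It is an illustration written for this post, not the code from the repository, and the 'bogus' attribute is made up for the example.

```cfml
// Hypothetical sketch of constrainAttributes; the real helper may differ.
function constrainAttributes(defaults, attrs) {
	var result = {};
	var key = "";
	for (key in defaults) {
		// Use the supplied value when present, otherwise fall back to the default.
		result[key] = structKeyExists(attrs, key) ? attrs[key] : defaults[key];
	}
	// Keys of attrs that are absent from defaults are never copied, so they are dropped.
	return result;
}

cleaned = constrainAttributes(
	{block = true, color = "000000"},
	{color = "0000ff", bogus = "ignored"}
);
// cleaned holds block = true and color = "0000ff"; the "bogus" key is gone.
```

Building a fresh struct rather than modifying attrs in place leaves the caller's struct untouched.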

The last piece is actually wiring stuff together:

sc = createObject("component", "shortcodes").init();
sc.add("latex", createObject("component", "latex").init());
sc.add("time", time_callback);
resultContent = sc.process(rawContent);

The shortcodes implementation is stateless; it's designed to be initialized and assembled once and then reused for processing multiple pieces of content. Individual shortcode implementations should follow the same pattern.
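One way to get the "assemble once, reuse everywhere" behavior is to build the processor at application startup and keep it in the application scope. The sketch below assumes an Application.cfc sitting next to shortcodes.cfc and latex.cfc; the application name and the application.shortcodes key are made up for the example.

```cfml
// Application.cfc: hypothetical wiring, assuming shortcodes.cfc and latex.cfc are resolvable from here
component {

	this.name = "shortcodesDemo"; // hypothetical application name

	function onApplicationStart() {
		// Build the processor once; it is stateless, so every request can share it.
		var sc = createObject("component", "shortcodes").init();
		sc.add("latex", createObject("component", "latex").init());
		application.shortcodes = sc;
		return true;
	}

}
```

Any page can then call application.shortcodes.process(rawContent) without re-registering handlers on each request.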

The Code

The code is available at https://ssl.barneyb.com/svn/barneyb/shortcodes/trunk including the shortcodes implementation, the unit tests, this demo file, and the LaTeX shortcode implementation.

The unit tests require a small mod to the current version (1.0.8) of MXUnit to support non-test public methods on TestCases (see revision 1367 in SVN). Later versions should already have the mod in place. You can also view the unit test output online.

Other Info

The "block hack" I mentioned above is a mod to the WP-LaTeX plugin, ported along with the core, that adds a 'block' attribute to the shortcode. The attribute controls whether the img gets a "block" class in addition to its "latex" class.

Note that the WP-LaTeX plugin, in addition to the shortcode, provides a raw content filter to support $latex ...$ syntax. This is not a shortcode, and therefore not supported by the port of the shortcode implementation.

There also seems to be a minor bug with shortcode escaping. It's sporadic, and seems to be due to some minor difference between the PHP and Java regex engines.

You can read all about shortcodes at the WordPress.org site's documentation page: http://codex.wordpress.org/Shortcode_API